Author Profiling for English Emails

Authors

  • Dominique Estival
  • Tanja Gaustad
  • Son Bao Pham
  • Will Radford
  • Ben Hutchinson
Abstract

This paper reports on some aspects of a project aimed at automating the analysis of texts for the purpose of author profiling and identification. The complete analysis provides probabilities for the author’s basic demographic traits (gender, age, geographic origin, level of education and native language) as well as for five psychometric traits. We describe the email data which was collected for the project, the ways this data is processed and analysed, and the experimental setup used for classification with the Text Attribution Tool (TAT) before presenting our results for the demographic and psychometric traits using English email. Results are very promising for all ten traits examined.

Similar resources

TAT: An Author Profiling Tool with Application to Arabic Emails

This paper reports on the application of the Text Attribution Tool (TAT) to profiling the authors of Arabic emails. The TAT system has been developed for the purpose of language-independent author profiling and has now been trained on two email corpora, English and Arabic. We describe the overall TAT system and the Machine Learning experiments resulting in classifiers for the different author t...

Investigating Non-Native English Speaking Graduate Students’ Pragmatic Development in Requestive Emails

The present study investigated learners’ interlanguage pragmatic development through analysis of 99 requestive emails addressed to a faculty member over a period of up to two years. Most previous studies mainly investigated how non-native English speaking students’ (NNESs) pragmalinguistic and sociopragmatic competence differed from native English speaking students (NESs) and compared learners ...

A Critical Analysis of Financial Fraud Spam in English in Terms of Persuasive Strategies: Personalization, Presupposition, and Lexical Choices

The term ‘spam’ addresses unsolicited emails sent in bulk; therefore, the term ‘financial fraud spam’ refers to unwanted bulk emails in which different tricks and techniques are employed to swindle money from the recipients. Estimates show that more than 80% of worldwide email traffic in 2011 was spam. It should be noted that while the number of daily spam emails in 2002 was 2.4 billion, this numbe...

Politeness in Emails Exchanged between English and Persian Speakers

Intercultural communication via email among various groups and societies has become an increasingly important aspect of communication. This research aims at investigating aspects of the negotiation of politeness meaning in emails exchanged between English and Persian speakers with different cultural backgrounds. The present study also reveals the potentials for using emails to experience c...

The Effect of CMC in Business Emails in Lingua Franca: Discourse Features and Misunderstandings

The paper argues that everyday exchange of business emails produces a development in the work-group relationship, which, in turn, makes new communication styles possible and acceptable by the users' habit to computer-mediated forms, even in unbalanced professional exchanges. The focus is on the (spoken) discourse features of email messages in a self-compiled corpus of selected computer-mediated...

Publication date: 2007